-
Notifications
You must be signed in to change notification settings - Fork 60
Fix segmentation fault and calculation error in AveragePool2dKernel #2091
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes critical issues in the AveragePool2dKernel implementation for XPU devices, specifically addressing a segmentation fault and calculation error that were causing test failures.
- Replaced direct index access with XPU_KERNEL_LOOP macros for safer kernel iteration
- Changed data types from index_t to int64_t for better overflow handling
- Added group_size limit to prevent exceeding hardware capabilities
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check if test_avg_pool_large_tensor2_xpu
is activated in CI.
The rest of this PR looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit, otherwise, LGTM.
Fixed the following issues found by test/test_nn.py::TestNNDeviceTypeXPU::test_avg_pool_large_tensor2_xpu